Building Test Suites for UIMA Components
نویسندگان
چکیده
We summarize our experiences building a comprehensive suite of tests for a statistical natural language processing toolkit, ClearTK. We describe some of the challenges we encountered, introduce a software project that emerged from these efforts, summarize our resulting test suite, and discuss some of the les-
منابع مشابه
CFE - A System for Testing, Evaluation and Machine Learning of UIMA Based Applications
There is a vast quantity of information available in unstructured form, and the academic and scientific communities are increasingly looking into new techniques for extracting key elements finding the structure in the unstructured. There are various ways to identify and extract this type of data; one leading system, which we will focus on, is the UIMA framework. Tasks that are often desirable t...
متن کاملIntegrated Tools for Query-driven Development of Light-weight Ontologies and Information Extraction Components
This paper reports on a user-friendly terminology and information extraction development environment that integrates into existing infrastructure for natural language processing and aims to close a gap in the UIMA community. The tool supports domain experts in data-driven and manual terminology refinement and refactoring. It can propose new concepts and simple relations and includes an informat...
متن کاملUsing UIMA to Structure An Open Platform for Textual Entailment
EXCITEMENT is a novel, open software platform for Textual Entailment (TE) which uses the UIMA framework. This paper discusses the design considerations regarding the roles of UIMA within EXCITEMENT Open Platform (EOP). We focus on two points: a) how to best design the representation of entailment problems within UIMA CAS and its type system. b) the integration and usage of UIMA components among...
متن کاملCSE Framework: A UIMA-based Distributed System for Configuration Space Exploration
To efficiently build data analysis and knowledge discovery pipelines, researchers and developers tend to leverage available services and existing components by plugging them into different phases of the pipelines, and then spend hours to days seeking the right components and configurations that optimize the system performance. In this paper, we introduce the CSE framework , a distributed system...
متن کاملMaking UIMA Truly Interoperable with SPARQL
Unstructured Information Management Architecture (UIMA) has been gaining popularity in annotating text corpora. The architecture defines common data structures and interfaces to support interoperability of individual processing components working together in a UIMA application. The components exchange data by sharing common type systems—schemata of data type structures—which extend a generic, t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009